First, we will install and load packages that we need for this tutorial. In order to use the tidycensus package, you will need to sign up for a Census API key: https://api.census.gov/data/key_signup.html
#install.packages("tidyverse")
#install.packages("readr")
#install.packages("sf")
#install.packages("tidycensus")
#install.packages("patchwork")
library(tidyverse)
## Warning: package 'ggplot2' was built under R version 4.3.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readr)
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(tidycensus)
library(ggspatial) # north arrow and scale bar
library(patchwork) # combine multiple maps
# library(ipumsr)
# census_api_key("INSERT API KEY", install = TRUE)
#this is christina's but delete later
# census_api_key("0d1515194f46f009f34f94afcbf045315abdbfbd", install = TRUE)
When referring to a new function for the first time, we will use the
double colon operator (::) to specify the source package of
the function. For example, dplyr::filter() refers to the
filter() function from the dplyr package, which is used to
subset rows in a data frame that meets certain conditions (i.e., where
Species is setosa).
data(iris) # This loads the iris dataset
dplyr::filter(iris, Species == 'setosa')
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
## 7 4.6 3.4 1.4 0.3 setosa
## 8 5.0 3.4 1.5 0.2 setosa
## 9 4.4 2.9 1.4 0.2 setosa
## 10 4.9 3.1 1.5 0.1 setosa
## 11 5.4 3.7 1.5 0.2 setosa
## 12 4.8 3.4 1.6 0.2 setosa
## 13 4.8 3.0 1.4 0.1 setosa
## 14 4.3 3.0 1.1 0.1 setosa
## 15 5.8 4.0 1.2 0.2 setosa
## 16 5.7 4.4 1.5 0.4 setosa
## 17 5.4 3.9 1.3 0.4 setosa
## 18 5.1 3.5 1.4 0.3 setosa
## 19 5.7 3.8 1.7 0.3 setosa
## 20 5.1 3.8 1.5 0.3 setosa
## 21 5.4 3.4 1.7 0.2 setosa
## 22 5.1 3.7 1.5 0.4 setosa
## 23 4.6 3.6 1.0 0.2 setosa
## 24 5.1 3.3 1.7 0.5 setosa
## 25 4.8 3.4 1.9 0.2 setosa
## 26 5.0 3.0 1.6 0.2 setosa
## 27 5.0 3.4 1.6 0.4 setosa
## 28 5.2 3.5 1.5 0.2 setosa
## 29 5.2 3.4 1.4 0.2 setosa
## 30 4.7 3.2 1.6 0.2 setosa
## 31 4.8 3.1 1.6 0.2 setosa
## 32 5.4 3.4 1.5 0.4 setosa
## 33 5.2 4.1 1.5 0.1 setosa
## 34 5.5 4.2 1.4 0.2 setosa
## 35 4.9 3.1 1.5 0.2 setosa
## 36 5.0 3.2 1.2 0.2 setosa
## 37 5.5 3.5 1.3 0.2 setosa
## 38 4.9 3.6 1.4 0.1 setosa
## 39 4.4 3.0 1.3 0.2 setosa
## 40 5.1 3.4 1.5 0.2 setosa
## 41 5.0 3.5 1.3 0.3 setosa
## 42 4.5 2.3 1.3 0.3 setosa
## 43 4.4 3.2 1.3 0.2 setosa
## 44 5.0 3.5 1.6 0.6 setosa
## 45 5.1 3.8 1.9 0.4 setosa
## 46 4.8 3.0 1.4 0.3 setosa
## 47 5.1 3.8 1.6 0.2 setosa
## 48 4.6 3.2 1.4 0.2 setosa
## 49 5.3 3.7 1.5 0.2 setosa
## 50 5.0 3.3 1.4 0.2 setosa
In this tutorial, we will use pipes (%>%)
from the magrittr package in order to clearly express sequences of
steps. Piping allows us to read the code from top –> bottom rather
than inside –> out.
# Without pipes
head(dplyr::select(dplyr::filter(iris, Species == 'setosa'), Sepal.Length, Sepal.Width))
## Sepal.Length Sepal.Width
## 1 5.1 3.5
## 2 4.9 3.0
## 3 4.7 3.2
## 4 4.6 3.1
## 5 5.0 3.6
## 6 5.4 3.9
# With pipes
iris %>%
dplyr::filter(Species == 'setosa') %>%
dplyr::select(Sepal.Length, Sepal.Width) %>%
head()
## Sepal.Length Sepal.Width
## 1 5.1 3.5
## 2 4.9 3.0
## 3 4.7 3.2
## 4 4.6 3.1
## 5 5.0 3.6
## 6 5.4 3.9
You can get a list of variable names and descriptions from the U.S.
Census using the load_variables() function from the
tidycensus package. Since we are interested in the year 2021, let’s use
the American Community Survey (vs. the Decennial Census data). We will
use the 5-year ACS estimates and filter to variables that are available
at the Census tract level.
variables_2021_tract <- tidycensus::load_variables(year = 2021, dataset = "acs5", cache = TRUE) %>%
dplyr::filter(geography == "tract")
variables_2021_tract %>% head()
## # A tibble: 6 × 4
## name label concept geography
## <chr> <chr> <chr> <chr>
## 1 B01001A_001 Estimate!!Total: SEX BY AGE (WHI… tract
## 2 B01001A_002 Estimate!!Total:!!Male: SEX BY AGE (WHI… tract
## 3 B01001A_003 Estimate!!Total:!!Male:!!Under 5 years SEX BY AGE (WHI… tract
## 4 B01001A_004 Estimate!!Total:!!Male:!!5 to 9 years SEX BY AGE (WHI… tract
## 5 B01001A_005 Estimate!!Total:!!Male:!!10 to 14 years SEX BY AGE (WHI… tract
## 6 B01001A_006 Estimate!!Total:!!Male:!!15 to 17 years SEX BY AGE (WHI… tract
The str_detect() function from the stringr package
allows us to search text strings for key words. We can use it to find
the variable names related to median household income and educational
attainment.
variables_2021_tract %>%
dplyr::filter(concept %>% stringr::str_detect('MEDIAN INCOME'))
## # A tibble: 25 × 4
## name label concept geography
## <chr> <chr> <chr> <chr>
## 1 B06011_001 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 2 B06011_002 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 3 B06011_003 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 4 B06011_004 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 5 B06011_005 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 6 B07011_001 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 7 B07011_002 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 8 B07011_003 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 9 B07011_004 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## 10 B07011_005 Estimate!!Median income in the past 12 months -… MEDIAN… tract
## # ℹ 15 more rows
variables_2021_tract %>%
dplyr::filter(concept %>% stringr::str_detect('EDUCATION'))
## # A tibble: 498 × 4
## name label concept geography
## <chr> <chr> <chr> <chr>
## 1 B06009_001 Estimate!!Total: PLACE … tract
## 2 B06009_002 Estimate!!Total:!!Less than high school graduate PLACE … tract
## 3 B06009_003 Estimate!!Total:!!High school graduate (include… PLACE … tract
## 4 B06009_004 Estimate!!Total:!!Some college or associate's d… PLACE … tract
## 5 B06009_005 Estimate!!Total:!!Bachelor's degree PLACE … tract
## 6 B06009_006 Estimate!!Total:!!Graduate or professional degr… PLACE … tract
## 7 B06009_007 Estimate!!Total:!!Born in state of residence: PLACE … tract
## 8 B06009_008 Estimate!!Total:!!Born in state of residence:!!… PLACE … tract
## 9 B06009_009 Estimate!!Total:!!Born in state of residence:!!… PLACE … tract
## 10 B06009_010 Estimate!!Total:!!Born in state of residence:!!… PLACE … tract
## # ℹ 488 more rows
We can also use the RStudio IDE to search the variables.
variables_2021_tract %>% view()
To download ACS data using the get_acs() function, we
need three pieces of information.
First, what do we want our geography to be? What’s our unit of analysis? For this we have decided to use the Census tracts of New York State. See a full list of available geographies here: https://walker-data.com/tidycensus/articles/basic-usage.html#geography-in-tidycensus
Second, what variables do we want? We can pull those directly in using the same command line. We can even give it our own titles.
Third, what year of data do we want? There’s room for that too!
Let’s use median income in the past 12 months (B06011_001) and educational attainment (B06009_001, B06009_002, B06009_003, B06009_004, B06009_005).
acs_data <- get_acs(geography = "tract",
variables = c(hhincome = "B06011_001",
total_education = "B06009_001",
education_lessthanhs = "B06009_002",
education_hs = "B06009_003",
education_somecollege = "B06009_004",
education_bachelors = "B06009_005"),
state = "NY",
year = 2021,
output = "wide")
## Getting data from the 2017-2021 5-year ACS
# Convert educational attainment counts to proportions
acs_data <- acs_data %>%
mutate(
perc_lessthanhs = (education_lessthanhsE / total_educationE),
perc_hs = (education_hsE / total_educationE),
perc_somecollege = (education_somecollegeE / total_educationE),
perc_bachelors = (education_bachelorsE / total_educationE)
)
#the above command brings in the data as a dataframe, which means there is no geometry attached
class(acs_data)
## [1] "tbl_df" "tbl" "data.frame"
Let’s begin by exploring the data. We can do that with summary() or a variety of data visualization techniques.
A histogram visually represents the distribution of a dataset by grouping data into bins and displaying the frequency of data points within each bin.
#Explore data
summary(acs_data[, c("perc_lessthanhs", "perc_hs", "perc_somecollege", "perc_bachelors")])
## perc_lessthanhs perc_hs perc_somecollege perc_bachelors
## Min. :0.00000 Min. :0.0000 Min. :0.0000 Min. :0.0000
## 1st Qu.:0.05360 1st Qu.:0.1922 1st Qu.:0.1892 1st Qu.:0.1301
## Median :0.09948 Median :0.2676 Median :0.2495 Median :0.1919
## Mean :0.12877 Mean :0.2622 Mean :0.2469 Mean :0.2035
## 3rd Qu.:0.17491 3rd Qu.:0.3357 3rd Qu.:0.3064 3rd Qu.:0.2640
## Max. :1.00000 Max. :1.0000 Max. :1.0000 Max. :0.7143
## NA's :115 NA's :115 NA's :115 NA's :115
#Histogram for distribution of pooulation with less than high school education across census tracts
plot_hist <- ggplot(acs_data, aes(x = perc_lessthanhs)) +
geom_histogram(binwidth = 0.03, fill = "lightblue", color = "black", alpha = 0.7) +
theme_minimal() +
labs(title = "Distribution of Population with Less Than High School Education",
x = "Percentage",
y = "Frequency")
plot_hist
## Warning: Removed 115 rows containing non-finite outside the scale range
## (`stat_bin()`).
A box plot, or box-and-whisker plot, displays the distribution of a dataset by showing its median, quartiles, and potential outliers.
# Boxplot for percentage of population with a high school degree across census tracts
plot_box <- ggplot(acs_data, aes(y = perc_lessthanhs)) +
geom_boxplot(fill = "blue", color = "black", alpha = 0.5) +
theme_minimal() +
labs(title = "Boxplot of Population with Less Than High School Education",
y = "Percentage")
plot_box
## Warning: Removed 115 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
A bar plot represents categorical data with rectangular bars, where the height or length of each bar is proportional to the value it represents.
#Bar plots for percentages
# Reshape the data for easier plotting
education_data <- acs_data %>%
select(GEOID, perc_lessthanhs, perc_hs, perc_somecollege, perc_bachelors) %>%
pivot_longer(cols = starts_with("perc_"), names_to = "education_level", values_to = "proportion")
print(head(education_data))
## # A tibble: 6 × 3
## GEOID education_level proportion
## <chr> <chr> <dbl>
## 1 36001000100 perc_lessthanhs 0.246
## 2 36001000100 perc_hs 0.283
## 3 36001000100 perc_somecollege 0.234
## 4 36001000100 perc_bachelors 0.155
## 5 36001000201 perc_lessthanhs 0.117
## 6 36001000201 perc_hs 0.223
summary(education_data$proportion)
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## 0.0000 0.1212 0.2096 0.2103 0.2900 1.0000 460
# Bar plot for educational attainment percentages
plot_bar <- education_data %>%
dplyr::group_by(education_level) %>%
dplyr::summarize(mean_proportion = mean(proportion, na.rm = TRUE)) %>%
ggplot(aes(x = education_level, y = mean_proportion, fill = education_level)) +
geom_bar(stat = 'identity', color = "black", alpha = 0.7) +
theme_minimal() +
labs(title = "Educational Attainment Proportions",
x = "Education Level",
y = "Proportion") +
theme(legend.position = "none")
plot_bar
Now first we downloaded the ACS data using tidycensus, but it was just a dataframe. This means there was no spatial component attached. The good news about get_acs() is that we can also ask it to bring in geometries – an essential tool to make a map!
#to also bring in geometries, modify the command to bring in geometries
acs_data_geo <- get_acs(
geography = "tract",
variables = c(hhincome = "B19013_001",
total_education = "B16010_001",
education_lessthanhs= "B16010_002",
education_hs = "B16010_015",
education_somecollege = "B16010_028",
education_bachelors = "B16010_041"),
state = "NY",
year = 2021,
output = "wide",
geometry = TRUE) %>%
dplyr::transmute(
GEOID = GEOID,
hhincome = hhincomeE,
perc_lessthanhs = (education_lessthanhsE / total_educationE),
perc_hs = (education_hsE / total_educationE),
perc_somecollege = (education_somecollegeE / total_educationE),
perc_bachelors = (education_bachelorsE / total_educationE)
)
## Getting data from the 2017-2021 5-year ACS
## Downloading feature geometry from the Census website. To cache shapefiles for use in future sessions, set `options(tigris_use_cache = TRUE)`.
##
|
| | 0%
|
|= | 1%
|
|= | 2%
|
|== | 3%
|
|=== | 4%
|
|==== | 5%
|
|==== | 6%
|
|===== | 7%
|
|====== | 8%
|
|====== | 9%
|
|======= | 10%
|
|======== | 11%
|
|======== | 12%
|
|========= | 13%
|
|=========== | 16%
|
|============ | 17%
|
|============= | 19%
|
|============== | 20%
|
|============== | 21%
|
|================ | 22%
|
|================ | 23%
|
|======================= | 32%
|
|============================= | 42%
|
|=================================== | 51%
|
|========================================== | 60%
|
|================================================ | 69%
|
|======================================================= | 78%
|
|============================================================= | 87%
|
|==================================================================== | 97%
|
|======================================================================| 100%
#using base R to plot
#NY state
plot(st_geometry(acs_data_geo), col = NA, border = 'grey')
# Define the years of interest
years <- 2009:2021
#define the variables from above
variables <- c(hhincome = "B19013_001")
#download data for multiple years
# Function to download ACS data for a given year
get_acs_data <- function(year) {
get_acs(
geography = "tract",
variables = variables,
state = "NY",
year = year,
output = "wide",
geometry = TRUE
) %>%
mutate(year = year) # Add a column for the year
}
# Download data for all years and combine into one data frame
acs_data_list <- lapply(years, get_acs_data)
acs_data_all_years <- bind_rows(acs_data_list)
View(acs_data_all_years)
# Check the structure of the combined data
str(acs_data_all_years)
#plotting median household income over the years
#there are over 4,000 census tracts so let's aggregate and plot averages
average_income <- acs_data_all_years %>%
group_by(year) %>%
summarise(avg_hhincome=mean(hhincomeE, na.rm=TRUE))
income_overtime <- ggplot(average_income, aes(x=year, y=avg_hhincome))+
geom_line(color="blue") +
geom_point(color="blue")+
theme_minimal()+
labs(title="Average Median Household Income Over Time in NY (2009-2021)",
x="Year",
y= "Average Median Household Income")
income_overtime
You can also map these! But we’ll do this later.
The PLACES data was prepared by the Centers for Disease Control and Prevention (CDC), Robert Wood Johnson Foundation, and the CDC Foundation.
places_prelim_df <- readr::read_csv("~/Library/CloudStorage/OneDrive-ColumbiaUniversityIrvingMedicalCenter/PhD Admin/Projects Collaborations/R Consortium with Stephen/places2023.csv")
## Rows: 2555113 Columns: 23
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (16): StateAbbr, StateDesc, CountyName, CountyFIPS, LocationName, DataSo...
## dbl (5): Year, Data_Value, Low_Confidence_Limit, High_Confidence_Limit, Tot...
## lgl (2): Data_Value_Footnote_Symbol, Data_Value_Footnote
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
places_prelim_df %>% head() %>% View()
places_cvd <- places_prelim_df %>%
dplyr::filter(MeasureId == 'CHD' & StateAbbr == 'NY') %>% # measureid
dplyr::transmute(GEOID = LocationName,
state = StateAbbr, # stateabbr
chd_prev = Data_Value/100) # data_value
merged <- acs_data_geo %>%
dplyr::left_join(places_cvd, by = 'GEOID')
merged %>%
skimr::skim(hhincome, perc_lessthanhs, perc_hs, perc_somecollege, perc_bachelors)
| Name | Piped data |
| Number of rows | 5411 |
| Number of columns | 9 |
| _______________________ | |
| Column type frequency: | |
| numeric | 5 |
| ________________________ | |
| Group variables | None |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| hhincome | 213 | 0.96 | 82543.44 | 40423.44 | 2499 | 55325.50 | 74980.00 | 101862.00 | 250001 | ▃▇▃▁▁ |
| perc_lessthanhs | 115 | 0.98 | 0.13 | 0.10 | 0 | 0.05 | 0.10 | 0.17 | 1 | ▇▂▁▁▁ |
| perc_hs | 115 | 0.98 | 0.26 | 0.11 | 0 | 0.19 | 0.27 | 0.34 | 1 | ▃▇▁▁▁ |
| perc_somecollege | 115 | 0.98 | 0.25 | 0.09 | 0 | 0.19 | 0.25 | 0.31 | 1 | ▃▇▁▁▁ |
| perc_bachelors | 115 | 0.98 | 0.36 | 0.20 | 0 | 0.21 | 0.33 | 0.48 | 1 | ▅▇▅▂▁ |
# Income
merged %>%
ggplot(aes(y = hhincome, x = "")) +
geom_violin() +
geom_boxplot(width = 0.2) +
labs(x = "", y = "Median Household Income")
## Warning: Removed 213 rows containing non-finite outside the scale range
## (`stat_ydensity()`).
## Warning: Removed 213 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
# Educational attainment
merged_long_edu <- merged %>%
tidyr::pivot_longer(cols = -c(GEOID, state, geometry, hhincome),
names_to = 'variable_name',
values_to = 'value')
merged_long_edu %>%
ggplot(aes(y = value, x = "")) +
geom_violin() +
geom_boxplot(width = 0.2) +
facet_grid(. ~ variable_name) +
labs(x = "", y = "Proportion")
## Warning: Removed 1533 rows containing non-finite outside the scale range
## (`stat_ydensity()`).
## Warning: Removed 1533 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
Use the ggplot::geom_sf() in order to plot the Census
tract boundaries for New York.
merged %>%
ggplot() +
ggplot2::geom_sf()
A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the value of a variable being represented. This type of map is commonly used to visualize how a measurement varies across a geographic area, such as population density or median income.
In ggplot::geom_sf(), use the fill argument
in the aes() function to specify a variable that you would
like to visualize In our example, we are visualizing the geographic
distribution of those who have a percentage high school education.
smap_1 <- merged %>%
ggplot() +
geom_sf(aes(fill = hhincome))
smap_1
Use the low and high arguments in the
scale_fill_continuous() to specify colors to use for low
and high values on a continuous scale.
smap_2 <- smap_1 +
scale_fill_continuous(low = 'white', high = 'blue')
smap_2
Use the theme_void() function to remove the background
and axis marks.
smap_3 <- smap_2 +
theme_void()
smap_3
Use the title argument in the labs()
function in order to create a title for the map.
smap_4 <- smap_3 +
labs(title = 'High School Educational Attainment in New York')
smap_4
Use the annotation_north_arrow() and
annotation_scale() functions from the ggspatial package to
add in a north arrow and scalebar to the map - best practices when
building a map.
location = tl argument in
annotation_north_arrow() function is used to place the
north arrow on the top left area of the map.unit_category = 'imperial' argument in
annotation_scale() function is used to change the units in
the scale bar from metric to imperial units (showing miles instead of
kilometers).smap_5 <- smap_4 +
ggspatial::annotation_north_arrow(location = 'tl') + # tl for top-left
ggspatial::annotation_scale(unit_category = 'imperial') # imperial units to show miles instead of km
smap_5
We can create the map using the previous functions in one step.
smap <- merged %>%
ggplot() +
ggplot2::geom_sf() +
geom_sf(aes(fill = perc_hs)) +
scale_fill_continuous(low = 'white', high = 'blue') +
theme_void() +
labs(title = 'High School Educational Attainment in New York') +
ggspatial::annotation_north_arrow(location = 'tl') +
ggspatial::annotation_scale(unit_category = 'imperial')
smap
Long datasets are easier to use with ggplot2.
A long dataset, also known as a “tidy” dataset, is structured so that each row represents a single observation, with one column for each variable and an additional column for the values. This format is often used for time series or repeated measures data.
In contrast, a wide dataset has a format where each row represents a single subject or entity, with multiple columns for different measurements or observations taken at different times or under different conditions. This format is common in datasets where comparisons across different time points or conditions are needed.
We use the tidyr::pivot_longer() function in order to
create one column for the values of the CHD and educational attainment
variables in our dataset.
merged_long <- merged %>%
tidyr::pivot_longer(cols = c(chd_prev, perc_lessthanhs, perc_hs, perc_somecollege, perc_bachelors),
names_to = 'variable_name',
values_to = 'value')
Use the facet_wrap() function to create a facet plot
(allows us to create multiple maps in a grid) for our 5 variables. We
use the ncol = 3 argument to specify that we would like to
create a grid of maps with 3 columns.
multmap_1 <- merged_long %>%
ggplot() +
ggplot2::geom_sf() +
geom_sf(aes(fill = value)) +
facet_wrap(vars(variable_name), ncol = 3)
multmap_1
Use the same functions as we used with the single variable choropleth map in order to add different legend colors, theme, a title, a north arrow, and a scalebar.
multmap_2 <- multmap_1 +
scale_fill_continuous(low = 'white', high = 'blue') +
theme_void() +
labs(title = 'High School Educational Attainment in New York') +
ggspatial::annotation_north_arrow(location = 'tl') +
ggspatial::annotation_scale(unit_category = 'imperial')
multmap_2
We can create the map using the previous functions in one step.
multmap <- merged_long %>%
ggplot() +
ggplot2::geom_sf() +
geom_sf(aes(fill = value)) +
facet_wrap(vars(variable_name), ncol = 3) +
scale_fill_continuous(low = 'white', high = 'blue') +
theme_void() +
labs(title = 'High School Educational Attainment in New York') +
ggspatial::annotation_north_arrow(location = 'tl') +
ggspatial::annotation_scale(unit_category = 'imperial')
multmap
Using the patchwork package, we can use the / operator
to stack our previously created maps, one on top of the other.
smap / multmap
Below are some additional references and resources: